Classification of Protein Interaction Sentences via Gaussian Processes

Authors

  • Tamara Polajnar
  • Simon Rogers
  • Mark A. Girolami
Abstract

The increase in the availability of protein interaction studies in textual format, coupled with the demand for easier access to the key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extracting small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier that has rarely been applied to text: Gaussian processes (GPs). GPs are a nonparametric probabilistic analogue of the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of a margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework, makes GPs an appealing alternative worthy of further adoption.
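As a rough illustration of the kind of classifier the abstract discusses, the sketch below scores sentences with a GP-style predictor over bag-of-words features. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say which kernel or inference scheme was used, so this example swaps in least-squares label regression (GP regression on ±1 labels) with a cosine kernel. The toy sentences, the `noise` value, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: +1 = interaction sentence, -1 = other sentence.
SENTENCES = [
    "protein A binds protein B",
    "RAD51 interacts with BRCA2",
    "kinase X phosphorylates substrate Y",
    "cells were grown at 37 degrees",
    "samples were stored overnight",
    "the buffer was replaced twice",
]
LABELS = [1, 1, 1, -1, -1, -1]


def bow(sentence):
    """Bag-of-words counts for a lowercased, whitespace-tokenised sentence."""
    return Counter(sentence.lower().split())


def cosine_kernel(a, b):
    """Normalised linear kernel between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit(sentences, labels, noise=0.1):
    """Compute alpha = (K + noise * I)^-1 y for the training sentences."""
    vecs = [bow(s) for s in sentences]
    K = [[cosine_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(vecs)] for i, a in enumerate(vecs)]
    alpha = solve(K, [float(y) for y in labels])
    return vecs, alpha


def predict(model, sentence):
    """Posterior mean of the latent function; its sign gives the class."""
    vecs, alpha = model
    x = bow(sentence)
    return sum(a * cosine_kernel(v, x) for v, a in zip(vecs, alpha))


if __name__ == "__main__":
    model = fit(SENTENCES, LABELS)
    for s in ["BRCA2 binds RAD51", "cells were stored overnight"]:
        score = predict(model, s)
        print(s, "->", "interaction" if score > 0 else "other")
```

Unlike an SVM, this predictor has no margin parameter to tune; the only free value here is the assumed noise term added to the kernel diagonal. A full GP classifier would additionally pass the latent score through a link function to obtain class probabilities.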

Similar resources

Protein interaction detection in sentences via Gaussian Processes: a preliminary evaluation

The non-parametric deterministic Support Vector Machines (SVMs) achieve high levels of performance in text classification. This article offers a much-needed evaluation of the Gaussian Process (GP) classifier, a non-parametric probabilistic analogue of SVMs that has rarely been applied to text classification. We provide an extensive experimental comparison of the performance and properties...

Joint Emotion Analysis via Multi-task Gaussian Processes

We propose a model for jointly predicting multiple emotions in natural language sentences. Our model is based on a low-rank coregionalisation approach, which combines a vector-valued Gaussian Process with a rich parameterisation scheme. We show that our approach is able to learn correlations and anti-correlations between emotions on a news headlines dataset. The proposed model outperforms both ...

The Rate of Entropy for Gaussian Processes

In this paper, we show that the Tsallis entropy rate for stochastic processes can be obtained as the limit of conditional entropy, as was done for the Shannon and Renyi entropy rates. Using this, we obtain the Tsallis entropy rate for stationary Gaussian processes. Finally, we derive the relation between Renyi, Shannon and Tsallis entropy rates for stationary Gaussian proc...

Complete convergence of moving-average processes under negative dependence sub-Gaussian assumptions

Complete convergence is investigated for moving-average processes of a doubly infinite sequence of negatively dependent sub-Gaussian random variables with zero means, finite variances and absolutely summable coefficients. As a corollary, the rate of complete convergence is obtained under some suitable conditions on the coefficients.

Improved prediction of missing protein interactome links via anomaly detection

Interactomes such as protein interaction networks have many undiscovered links between entities. Experimental verification of every link in these networks is prohibitively expensive, and therefore computational methods to direct the search for possible links are of great value. The problem of finding undiscovered links in a network is also referred to as the link prediction problem. A popular a...

Publication date: 2009